Summary

Consider a one-way analysis of covariance model. Suppose that the parameter of
interest $\theta$ is a specified linear contrast of the expected responses, for a given
value of the covariate. Also suppose that the inference of interest is a $1 - \alpha$
confidence interval for $\theta$. The following two-stage procedure has been proposed to
determine the form of the model. In Stage 1, we carry out an F test of the null
hypothesis that the slopes are all zero against the alternative hypothesis that they
are not all zero. If this null hypothesis is accepted then we assume that the slopes
are all zero; otherwise we proceed to Stage 2. In Stage 2, we carry out an F test of
the null hypothesis that the slopes are all equal against the alternative hypothesis
that they are not all equal. If this null hypothesis is accepted then we assume that
the slopes are all equal; otherwise this assumption is not made. We present a general
methodology for the examination of the effect of this two-stage model selection
procedure on the coverage probability of a subsequently-constructed confidence
interval for $\theta$, with nominal coverage $1 - \alpha$. This methodology is applied to a
numerical example for which it is shown that this confidence interval is completely
inadequate.

Key words: confidence interval; coverage probability; F test; one-way analysis of 
covariance; preliminary hypothesis test. 

1. Introduction

Consider the one-way analysis of covariance model

$$Y_{ij} = a_i + b_i (x_{ij} - \bar{x}) + \varepsilon_{ij} \qquad (1)$$

where $Y_{ij}$ is the response of the $j$th experimental unit ($j = 1, \dots, n_i$) receiving
treatment $i$ ($i = 1, \dots, k$), when the covariate takes the value $x_{ij}$. The
$\varepsilon_{ij}$ are independent and identically $N(0, \sigma^2)$ distributed, where $\sigma^2$ is an
unknown positive parameter. The $a_i$ and the slopes $b_i$ are unknown parameters. Suppose
that the parameter of interest $\theta$ is a specified linear contrast of the expected
responses, for a given value of the covariate. Also suppose that the inference of
interest is a $1 - \alpha$ confidence interval (CI) for $\theta$.

Milliken & Johnson (2002, Section 2.3) propose the following two-stage procedure
to determine the form of the model. In Stage 1, we test the null hypothesis that
the slopes $b_i$ are all zero against the alternative hypothesis that they are not all
zero. This test is carried out using an F statistic. If this null hypothesis is accepted
then we assume that the slopes $b_i$ are all zero; otherwise we proceed to Stage 2.
In Stage 2, we test the null hypothesis that the slopes $b_i$ are all equal against the
alternative hypothesis that they are not all equal, which is also tested using an F
statistic. If this null hypothesis is accepted then we assume that the slopes $b_i$ are
all equal; otherwise this assumption is not made.

Our aim is to examine the effect of this two-stage model selection procedure on
the coverage probability (CP) of a subsequently constructed CI for $\theta$, with nominal
coverage $1 - \alpha$. This confidence interval is constructed on the assumption that the
model selected by this two-stage procedure had been given to us a priori as the true
model. This assumption is false and it may lead to a CI with very poor coverage
properties.

We present a general methodology for this examination in Sections 3 and 4.
Kabaila & Farchione (2012) present a method (using numerical evaluation of
multiple integrals) for evaluating the CP of a CI for a scalar parameter constructed
after a single preliminary F test, in the context of a linear regression model. This
method does not extend to the present case of two preliminary F tests. We therefore
need to use Monte Carlo simulation to estimate the CP of the CI for $\theta$, with
nominal coverage $1 - \alpha$, constructed after this two-stage model selection procedure.
In Section 3, we provide a simplified expression for the CP of this CI. Let
$\beta = (a_1, \dots, a_k, b_1, \dots, b_k)$. It follows from this simplified expression that this
CP is a function of $\gamma = \beta/\sigma$. Further, we show that this CP is a function of the
parameter vector $(\gamma_{k+1}, \dots, \gamma_{2k}) = (b_1/\sigma, \dots, b_k/\sigma)$. In Section 4 we describe a
new simulation method, using variance reduction by conditioning, for computing this CP.

In Section 2 this methodology is applied to an example (with number of
treatments $k = 3$) which is used by Milliken & Johnson (2002) to illustrate their
two-stage procedure for determining the form of the model. We suppose that the
parameter of interest $\theta$ is the difference between the expected responses of two
subjects receiving the treatments 1 and 2, for the same specified value of the
covariate. We consider the CI for $\theta$, with nominal coverage 95%, constructed after
this two-stage procedure. For both of the F tests used in this two-stage procedure,
the significance level was chosen to be 10%. The minimum CP of this CI is
approximately 0.44, showing that it is completely inadequate. Furthermore, as
illustrated by Figures 1 and 2, the CP of this CI is far below 0.95 for a wide range of
centrally-located values of the parameter vector
$(\gamma_4, \gamma_5, \gamma_6) = (b_1/\sigma, \dots, b_3/\sigma)$.

2. Numerical illustration for data taken from Milliken & Johnson (2002)

The data provided by Milliken & Johnson (2002) "were generated to simulate
real world applications that we have encountered in our consulting experience". In
this section, we consider data taken from Chapter 3 of Milliken & Johnson
(2002). These data concern the comparison of the effectiveness of three exercise
programs (treatments) on the heart rate of males with ages in the range from 28
to 35 years. A total of 24 males within this age range were chosen and eight males
were randomly assigned to each of the three treatments labelled 1, 2 and 3, so that
$k = 3$. Since the aim was to compare exercise programs at a common initial resting
heart rate, the initial heart rate of each of the subjects was used as a covariate.

In their illustrative analysis of this data, Milliken & Johnson (2002) begin with
the one-way analysis of covariance model (1) and perform the two-stage procedure
(described in the introduction) to determine the form of the model. We suppose
that the parameter of interest $\theta$ is the difference between the expected responses of
two subjects receiving treatments 1 and 2, for the same value $x^*$ of the covariate.
We consider the CI for $\theta$, with nominal coverage 95%, constructed after this
two-stage procedure. For both of the F tests performed in the two-stage procedure, the
significance level was chosen to be 10%.

Let $Y_1^*$ and $Y_2^*$ denote the responses of two subjects receiving treatments 1 and
2, respectively, for the same value $x^*$ of the covariate. That is

$$Y_1^* = a_1 + b_1 (x^* - \bar{x}) + \varepsilon_1^*$$
$$Y_2^* = a_2 + b_2 (x^* - \bar{x}) + \varepsilon_2^*$$

where $\varepsilon_1^*$ and $\varepsilon_2^*$ are $N(0, \sigma^2)$ distributed. We define

$$\theta = E(Y_1^*) - E(Y_2^*) = a_1 - a_2 + (b_1 - b_2)(x^* - \bar{x}).$$

That is, $\theta = a^T \beta$, where $a = (1, -1, 0, (x^* - \bar{x}), -(x^* - \bar{x}), 0)$. In this
example we chose $(x^* - \bar{x})$ such that $|x^* - \bar{x}|$ equals the maximum of all the
$|x_{ij} - \bar{x}|$ values.
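
As a small check of the contrast defined above, the following Python sketch builds the coefficient vector for $k = 3$ and confirms that its inner product with $\beta$ reproduces $a_1 - a_2 + (b_1 - b_2)(x^* - \bar{x})$. The function name `contrast_vector` and all numerical parameter values are hypothetical, chosen only for illustration; they are not taken from the data set.

```python
def contrast_vector(d):
    """Coefficient vector a for theta = a^T beta when k = 3.

    d stands for the covariate offset x* - xbar, and beta is ordered
    as (a1, a2, a3, b1, b2, b3).
    """
    return [1.0, -1.0, 0.0, d, -d, 0.0]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical parameter values, for illustration only.
a1, a2, a3 = 150.0, 140.0, 145.0
b1, b2, b3 = 0.5, 0.2, 0.3
d = 4.0  # hypothetical value of x* - xbar

beta = [a1, a2, a3, b1, b2, b3]
theta = dot(contrast_vector(d), beta)
theta_direct = a1 - a2 + (b1 - b2) * d
```

The two quantities `theta` and `theta_direct` agree up to floating-point rounding, which is all the identity $\theta = a^T \beta$ asserts.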

As we will show in Section 3 and Appendix C (and as already noted in the
introduction) the CP of the CI constructed after the two-stage procedure is a function
of $(\gamma_4, \gamma_5, \gamma_6)$, where $\gamma = \beta/\sigma$ ($\beta = (a_1, a_2, a_3, b_1, b_2, b_3)$). A search over the
entire parameter space for the minimum CP of this CI is nearly impossible. In Appendix
D, we provide details of how we restrict the scope of this search, so that it becomes
feasible. As shown in this appendix, the minimum CP of this CI is achieved
for $(\gamma_4, \gamma_5, \gamma_6) \in [-0.25, 0.25]^3$. Therefore, in the present section, we restrict our
analysis of this CP function to $(\gamma_4, \gamma_5, \gamma_6) \in [-0.25, 0.25]^3$.

We estimated the CPs for a grid of values of $(\gamma_4, \gamma_5, \gamma_6) \in [-0.25, 0.25]^3$, using
M = 10000 simulation runs for each parameter value. When these estimated CPs
are plotted using a 3-D scatter plot, it is observed that the minimum CP is
approximately 0.44 and that the CP is small for values of $(\gamma_4, \gamma_5, \gamma_6)$ that lie close to
two parallel straight lines in the 3-D space of parameters. The equations of these two
lines were found by fitting linear regression lines to the parameter values that gave
estimated CPs less than 0.6. The fitted equations for the two lines were found to
be as follows.

Line 1: $\gamma_4 = c$, $\gamma_5 = 0.088 + c$, $\gamma_6 = 0.041 + c$
Line 2: $\gamma_4 = c$, $\gamma_5 = -0.088 + c$, $\gamma_6 = -0.041 + c$

where $-0.25 < c < 0.25$. Then the CPs were re-estimated using M = 10000
simulation runs for each of a grid of parameter values on these two lines, and plotted
in Figures 1 and 2. Figure 1 is a plot of the estimated CP for parameters
$(\gamma_4, \gamma_5, \gamma_6)$ on Line 1. Figure 2 is a plot of the estimated CP for parameters
$(\gamma_4, \gamma_5, \gamma_6)$ on Line 2. From Figure 1, the minimum CP on Line 1 is estimated to
occur at $c = -0.0453$, i.e. at $(\gamma_4, \gamma_5, \gamma_6) = (-0.0453, 0.0427, -0.0043)$. This minimum
was estimated to be 0.4384, with standard error 0.0035. From Figure 2, the minimum CP
on Line 2 is estimated to occur at $c = 0.0468$, i.e. at
$(\gamma_4, \gamma_5, \gamma_6) = (0.0468, -0.0412, 0.0058)$. This minimum was estimated to be 0.4385,
with standard error 0.0035. Thus the minimum CP is, to a good approximation, 0.4385.
In other words, the CI for $\theta$ has minimum CP far below 0.95, showing that it is
completely inadequate. Furthermore, as illustrated by Figures 1 and 2, the CP of this
CI is far below 0.95 for a wide range of centrally-located values of the parameter
vector $(\gamma_4, \gamma_5, \gamma_6)$. The fact that, for this example, the lowest CPs lie close to two
parallel straight lines is investigated further in Appendix A.
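
The coordinates quoted for the two estimated minima follow directly from the fitted line equations. The Python sketch below parameterizes both lines by $c$ and recomputes those coordinates; the helper names and the spacing of the grid of $c$ values are assumptions made for illustration, since the text does not state which grid was used.

```python
def line_point(c, offset5, offset6):
    """Point (gamma4, gamma5, gamma6) on a line with direction (1, 1, 1)."""
    return (c, offset5 + c, offset6 + c)


def on_line_1(c):
    """Line 1: gamma4 = c, gamma5 = 0.088 + c, gamma6 = 0.041 + c."""
    return line_point(c, 0.088, 0.041)


def on_line_2(c):
    """Line 2: gamma4 = c, gamma5 = -0.088 + c, gamma6 = -0.041 + c."""
    return line_point(c, -0.088, -0.041)


# Locations of the estimated minima, using the values of c quoted in the text.
min_line_1 = on_line_1(-0.0453)
min_line_2 = on_line_2(0.0468)

# A grid of c values strictly inside (-0.25, 0.25); the spacing 0.01 is assumed.
grid = [round(-0.24 + 0.01 * i, 2) for i in range(49)]
points_line_1 = [on_line_1(c) for c in grid]
points_line_2 = [on_line_2(c) for c in grid]
```

Both lines share the direction vector $(1, 1, 1)$, which is why they are parallel.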

Figure 1: Plot of the estimated coverage probability for the parameter vector
$(\gamma_4, \gamma_5, \gamma_6)$ on Line 1: $\gamma_4 = c$, $\gamma_5 = 0.088 + c$, $\gamma_6 = 0.041 + c$

Figure 2: Plot of the estimated coverage probability for the parameter vector
$(\gamma_4, \gamma_5, \gamma_6)$ on Line 2: $\gamma_4 = c$, $\gamma_5 = -0.088 + c$, $\gamma_6 = -0.041 + c$

3. Simplified expression for the coverage probability of the confidence interval for $\theta$

In this section we consider the general situation described in the introduction.
The CI for $\theta$ has three different forms, depending on the model resulting from
the two-stage procedure. To find an expression for the coverage probability of this
confidence interval, we use the law of total probability (cf. Section 2 of Giri & Kabaila,
2008). This coverage probability is a function of the $(2k + 1)$-dimensional parameter
vector $(\beta, \sigma^2)$. By dividing by $\sigma$ in the appropriate way, we show that this coverage
probability is, in fact, a function of $\gamma = \beta/\sigma$. Then we prove Theorem 1, which states
that this CP is a function of the parameter vector
$(\gamma_{k+1}, \dots, \gamma_{2k}) = (b_1/\sigma, \dots, b_k/\sigma)$. This reduces the dimensionality of the
parameter space over which we search for the minimum CP.

For the model (1) that we consider, there are $k$ treatments with $n_i$ experimental
units allocated to the $i$th treatment, so that the total number of measurements of
the response is $n = \sum_{i=1}^{k} n_i$. We express the model as $Y = X\beta + \varepsilon$, where
$Y = (Y_{11}, \dots, Y_{1 n_1}, \dots, Y_{k1}, \dots, Y_{k n_k})$, $X$ is an $n \times 2k$ design matrix,
$\beta = (\beta_1, \dots, \beta_{2k}) = (a_1, \dots, a_k, b_1, \dots, b_k)$, and
$\varepsilon = (\varepsilon_{11}, \dots, \varepsilon_{1 n_1}, \dots, \varepsilon_{k1}, \dots, \varepsilon_{k n_k})$. Let $\hat{\beta}$ denote the
least squares estimator of $\beta$. Also let $S^2 = (Y - X\hat{\beta})^T (Y - X\hat{\beta})/m$, where
$m = n - 2k$.

The preliminary F tests are taken to follow the two-stage procedure described
in the introduction. Accordingly, the null hypothesis in Stage 1 is
$H_{0\tau}: b_1 = b_2 = \dots = b_k = 0$. Let $\tau = C_\tau^T \beta$. Here, $C_\tau^T = [\, 0 \mid I_k \,]$, where $0$
is the $k \times k$ zero matrix and $I_k$ is the $k \times k$ identity matrix. In other words,
$C_\tau^T$ is the $k \times 2k$ matrix defined such that
$C_\tau^T \beta = (b_1, \dots, b_k) = (\beta_{k+1}, \dots, \beta_{2k})$. Thus the test can be re-expressed
as $H_{0\tau}: \tau = 0$ against $H_{1\tau}: \tau \ne 0$. The F test for testing this hypothesis has the
following test statistic

$$F_\tau = \frac{m}{k} \, \frac{(\hat{\tau}/\sigma)^T V_{22}^{-1} (\hat{\tau}/\sigma)}{m S^2/\sigma^2}$$

where $\hat{\tau} = C_\tau^T \hat{\beta}$, and $V_{22} = (1/\sigma^2)\mathrm{Cov}(\hat{\tau}) = C_\tau^T (X^T X)^{-1} C_\tau$. The null
hypothesis $H_{0\tau}$ is rejected if $F_\tau > \ell_\tau$ (accepted otherwise).

The null hypothesis in Stage 2 is $H_{0\xi}: b_1 = b_2 = \dots = b_k$. We let $\xi = C_\xi^T \beta$.
Here $C_\xi^T = [\, 0 \mid 1 \mid -I_{k-1} \,]$, where $0$ is the $(k - 1) \times k$ zero matrix, $1$ is the
$(k - 1)$ vector of 1's and $I_{k-1}$ is the $(k - 1) \times (k - 1)$ identity matrix. In other
words, $C_\xi^T$ is the $(k - 1) \times 2k$ matrix defined such that
$C_\xi^T \beta = (b_1 - b_2, \dots, b_1 - b_k) = (\beta_{k+1} - \beta_{k+2}, \dots, \beta_{k+1} - \beta_{2k})$. Hence the test
is re-expressed as $H_{0\xi}: \xi = 0$ against $H_{1\xi}: \xi \ne 0$. The F test for testing this
hypothesis has the following test statistic

$$F_\xi = \frac{m}{k-1} \, \frac{(\hat{\xi}/\sigma)^T W_{22}^{-1} (\hat{\xi}/\sigma)}{m S^2/\sigma^2}$$

where $\hat{\xi} = C_\xi^T \hat{\beta}$, and $W_{22} = (1/\sigma^2)\mathrm{Cov}(\hat{\xi}) = C_\xi^T (X^T X)^{-1} C_\xi$. The null
hypothesis $H_{0\xi}$ is rejected if $F_\xi > \ell_\xi$ (accepted otherwise).
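
The two F statistics can be computed without forming the matrices above, by comparing residual sums of squares of the three nested models (all slopes zero, common slope, separate slopes). This relies on the standard identity that the quadratic form in the numerator of each statistic equals the increase in the residual sum of squares when the corresponding constraint is imposed. The Python sketch below is only an illustration under that identity: the function names and the data are hypothetical, and each group is assumed to contain at least two distinct covariate values.

```python
def centered_sums(x, y):
    """Within-group corrected sums of squares and cross-products."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxx, sxy, syy


def two_stage_f_statistics(groups):
    """Return (F_tau, F_xi) for a list of (x, y) pairs, one per treatment."""
    k = len(groups)
    n = sum(len(x) for x, _ in groups)
    m = n - 2 * k  # residual degrees of freedom of the full model
    sums = [centered_sums(x, y) for x, y in groups]
    sxx = [s[0] for s in sums]
    sxy = [s[1] for s in sums]
    syy = [s[2] for s in sums]
    # Residual sums of squares of the three nested models. Because every
    # treatment has its own intercept, centring the covariate within each
    # group gives the same residuals as centring at the overall mean.
    rss_full = sum(yy - xy ** 2 / xx for xx, xy, yy in sums)  # separate slopes
    rss_equal = sum(syy) - sum(sxy) ** 2 / sum(sxx)           # common slope
    rss_zero = sum(syy)                                       # all slopes zero
    f_tau = ((rss_zero - rss_full) / k) / (rss_full / m)
    f_xi = ((rss_equal - rss_full) / (k - 1)) / (rss_full / m)
    return f_tau, f_xi


# Hypothetical data: three treatments sharing the slope 2, with small
# perturbations chosen to be orthogonal to the intercept and the covariate.
x = [0.0, 1.0, 2.0, 3.0]
e = [0.1, -0.1, -0.1, 0.1]
groups = [(x, [c + 2.0 * xi + ei for xi, ei in zip(x, e)])
          for c in (1.0, 5.0, -3.0)]
f_tau, f_xi = two_stage_f_statistics(groups)
```

For these constructed data the fitted slopes are identical across treatments, so `f_xi` is zero up to rounding while `f_tau` is large; Stage 1 would reject and Stage 2 would accept.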

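The following three events form a partition of the sample space $\Omega$, induced by
this two-stage procedure:

$$A = \{\omega \in \Omega : F_\tau(\omega) \le \ell_\tau\}$$
$$B = \{\omega \in \Omega : F_\tau(\omega) > \ell_\tau, \; F_\xi(\omega) \le \ell_\xi\}$$
$$C = \{\omega \in \Omega : F_\tau(\omega) > \ell_\tau, \; F_\xi(\omega) > \ell_\xi\}$$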
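
The partition into the events A, B and C amounts to a simple decision rule on the two F statistics. A minimal Python sketch of that rule follows; the function name and the numerical critical values are hypothetical, chosen only to exercise each branch.

```python
def selected_event(f_tau, f_xi, l_tau, l_xi):
    """Return 'A', 'B' or 'C' according to the two-stage procedure."""
    if f_tau <= l_tau:
        return "A"  # Stage 1 accepts: all slopes assumed zero
    if f_xi <= l_xi:
        return "B"  # Stage 1 rejects, Stage 2 accepts: slopes assumed equal
    return "C"      # both tests reject: no assumption on the slopes


# Hypothetical critical values and statistics, for illustration only.
l_tau, l_xi = 2.5, 2.7
examples = [selected_event(f, g, l_tau, l_xi)
            for f, g in [(1.0, 9.0), (4.0, 1.0), (4.0, 5.0)]]
```

Note that the Stage 2 statistic is irrelevant whenever Stage 1 accepts, as in the first example, which is why exactly one of the three events occurs for every outcome.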

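The parameter of interest is the linear contrast $\theta = a^T \beta$. Let $\hat{\theta} = a^T \hat{\beta}$. Let
$v_{11} = w_{11} = (1/\sigma^2)\mathrm{Var}(\hat{\theta}) = a^T (X^T X)^{-1} a$,
$v_{21} = (1/\sigma^2) E\big((\hat{\tau} - \tau)(\hat{\theta} - \theta)\big) = C_\tau^T (X^T X)^{-1} a$ and
$w_{21} = (1/\sigma^2) E\big((\hat{\xi} - \xi)(\hat{\theta} - \theta)\big) = C_\xi^T (X^T X)^{-1} a$. Also
let $\hat{\beta}_\tau$ denote the value of $\beta$ that minimizes $R(\beta) = (Y - X\beta)^T (Y - X\beta)$ when
$\tau = 0$. As is well-known, $\hat{\beta}_\tau = G_\tau \hat{\beta}$, where
$G_\tau = I - (X^T X)^{-1} C_\tau V_{22}^{-1} C_\tau^T$. When $\tau = 0$, the standard $1 - \alpha$ confidence
interval for $\theta$ is

$$I_\tau = \left[ a^T \hat{\beta}_\tau \pm t(m + k) \sqrt{\frac{R(\hat{\beta}_\tau)}{m + k}} \, \sqrt{v^*} \right]$$

where $v^* = v_{11} - v_{21}^T V_{22}^{-1} v_{21}$ and $t(r)$ is the quantile defined by
$\Pr(T \le t(r)) = 1 - \alpha/2$ for $T \sim t_r$. Similarly, let $\hat{\beta}_\xi$ denote the value of $\beta$
that minimizes $R(\beta)$ when $\xi = 0$. As is well-known, $\hat{\beta}_\xi = G_\xi \hat{\beta}$, where
$G_\xi = I - (X^T X)^{-1} C_\xi W_{22}^{-1} C_\xi^T$. When $\xi = 0$, the standard $1 - \alpha$ confidence
interval for $\theta$ is

$$I_\xi = \left[ a^T \hat{\beta}_\xi \pm t(m + k - 1) \sqrt{\frac{R(\hat{\beta}_\xi)}{m + k - 1}} \, \sqrt{w^*} \right]$$

where $w^* = w_{11} - w_{21}^T W_{22}^{-1} w_{21}$. The standard $1 - \alpha$ confidence interval for $\theta$,
when fitting the full model to the data, is

$$I = \left[ a^T \hat{\beta} \pm t(m) \sqrt{v_{11}} \, S \right]$$

The CI for $\theta$, with nominal coverage $1 - \alpha$, constructed after the two-stage model
selection procedure is given by the following expression.

$$\mathrm{CI}(\omega) = \begin{cases} I_\tau(\omega) & \text{if } \omega \in A \\ I_\xi(\omega) & \text{if } \omega \in B \\ I(\omega) & \text{if } \omega \in C \end{cases}$$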

Therefore, using the law of total probability the CP of this CI can be expressed as

$$\Pr(\theta \in \mathrm{CI}) = \Pr(\theta \in I_\tau, A) + \Pr(\theta \in I_\xi, B) + \Pr(\theta \in I, C) \qquad (2)$$

A simplified expression for this CP is obtained by substituting the simplified
expressions, presented and derived in Appendix B, for the events $\{\theta \in I_\tau\}$,
$\{\theta \in I_\xi\}$ and $\{\theta \in I\}$. This simplified expression implies that this CP is a
function of $\gamma = \beta/\sigma$.

Theorem 1. The CP of the CI resulting from the two-stage procedure is (for given
design matrix $X$) a function of the parameter vector $(\gamma_{k+1}, \dots, \gamma_{2k})$.

The proof of this theorem is provided in Appendix C.

4. New simulation method for estimating the CP of the CI for $\theta$

In this section we describe a new simulation method for estimating the CP of the
CI for $\theta$, with nominal coverage $1 - \alpha$, constructed after the two-stage model
selection procedure. This new method uses variance reduction by conditioning. Variance
reduction by conditioning is described, for example, on p. 629 of Ross (2000).

Let $Q = \hat{\tau}/\sigma$ and $D = m S^2/\sigma^2$. In Appendix E, we provide expressions for the
following conditional probabilities:

$$p_\tau(q, d) = \Pr(\theta \in I_\tau, A \mid Q = q, D = d)$$
$$p_\xi(q, d) = \Pr(\theta \in I_\xi, B \mid Q = q, D = d)$$
$$p(q, d) = \Pr(\theta \in I, C \mid Q = q, D = d).$$

The CP of the CI for $\theta$ is

$$E\big(p_\tau(Q, D)\big) + E\big(p_\xi(Q, D)\big) + E\big(p(Q, D)\big). \qquad (3)$$

We could estimate this CP by adding simulation estimates of each of the terms
making up this sum. Obviously, (3) is equal to $E\big(p_\tau(Q, D) + p_\xi(Q, D) + p(Q, D)\big)$.
Thus, an alternative simulation estimate of this CP is the sample average of M
independent observations of $p_\tau(Q, D) + p_\xi(Q, D) + p(Q, D)$. In the context of the
example described in Section 2, this is the more efficient simulation method.
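
The idea of variance reduction by conditioning can be seen on a much simpler problem than the one treated here. The Python sketch below is a toy illustration only, not the simulation used for the results in Section 2: it estimates $\Pr(X + Y > c)$ for independent standard normal $X$ and $Y$, once by averaging indicators and once by averaging the exactly known conditional probability $\Pr(X + Y > c \mid X) = 1 - \Phi(c - X)$. The function name, the value of $c$, the seed and the number of runs are all assumptions.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance

PHI = NormalDist().cdf  # N(0, 1) distribution function


def simulate(c=1.0, runs=20000, seed=0):
    """Simulate crude and conditioned observations for Pr(X + Y > c).

    X and Y are independent N(0, 1). Both lists have the same expected
    value, so their sample means estimate the same probability.
    """
    rng = random.Random(seed)
    crude, conditioned = [], []
    for _ in range(runs):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        crude.append(1.0 if x + y > c else 0.0)
        # Pr(X + Y > c | X = x) = 1 - Phi(c - x), evaluated exactly.
        conditioned.append(1.0 - PHI(c - x))
    return crude, conditioned


crude, conditioned = simulate()
exact = 1.0 - PHI(1.0 / math.sqrt(2.0))  # X + Y ~ N(0, 2)
est_crude, est_cond = fmean(crude), fmean(conditioned)
var_crude, var_cond = pvariance(crude), pvariance(conditioned)
```

Both estimators are unbiased, but the conditioned observations have smaller variance, so the same number of runs gives a smaller standard error. In the setting of this section the role of the conditional probability is played by $p_\tau(Q, D) + p_\xi(Q, D) + p(Q, D)$.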

5. Discussion

The literature on the effect of preliminary model selection (using, for example, 
hypothesis tests or minimizing a criterion such as AIC) on CIs is reviewed by Kabaila 
(2009). It is commonly the case that preliminary model selection has a detrimental 
effect on the CP of these CIs. However, each case (specified by a model, a model 
selection procedure and a parameter of interest) needs to be considered individually 
on its merits. 

In the present paper, we consider the two-stage model selection procedure pro- 
posed by Milliken & Johnson (2002, Section 2.3) in the context of a one-way anal- 
ysis of covariance model. This procedure involves the use of two F tests. We 
present a general methodology for examining the effect of this procedure on the 
CP of a subsequently-constructed CI for a specified linear contrast of the expected 
responses, for a given value of the covariate. This general methodology has the
following two components. The first component is a theorem that states that this CP
is a function of a $k$-dimensional parameter vector, rather than a $(2k + 1)$-dimensional
parameter vector (as one might initially suppose), where $k$ is the number of
treatments. This increases the feasibility of examining the coverage probability function
closely, including (a) finding its minimum and (b) finding those parts of the
parameter space where it is far below nominal. The second component is a new simulation
method, using variance reduction by conditioning, for computing this CP. Although
the derivation of this simulation method is complicated, it brings important benefits
in the form of increased simulation efficiency. This general methodology extends in
the obvious way to any two-stage model selection procedure that uses two F tests
(in a similar way to that described in the introduction) in the context of any
linear regression model with independent and identically normally distributed random
errors.

We have applied this general methodology to data taken from Chapter 3 of 
Milliken & Johnson (2002), where the difference of expected responses for treatments 
1 and 2 is, for a given value of the covariate, specified as the parameter of interest. 
We have shown that the CP of the CI for this contrast is far below nominal for a 
wide range of centrally-located parameter values. This throws doubt on the utility 
of the two-stage model selection procedure proposed by Milliken & Johnson (2002, 
Section 2.3). 

Appendix A: Influence of $a$ on the locus of values of $(\gamma_4, \gamma_5, \gamma_6)$ for which the CP is small

The locus of values of $(\gamma_4, \gamma_5, \gamma_6)$, for which the CP is small, was found to
depend mainly on the linear contrast $\theta$ under consideration. We see this by giving
different values to $a$, the vector of contrast coefficients. For example, when
$a = (1, 0, 0, (x^* - \bar{x}), 0, 0)$, the lower estimated CPs are found to lie close to two
parallel planes that are parallel to the plane that includes the $\gamma_5$ and $\gamma_6$ axes.
Also, when $a = (0, 1, 0, 0, (x^* - \bar{x}), 0)$, the lower estimated CPs lie close to two
parallel planes that are parallel to the plane that includes the $\gamma_4$ and $\gamma_6$ axes.



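Appendix B: Simplified expressions for the events $\{\theta \in I_\tau\}$, $\{\theta \in I_\xi\}$ and $\{\theta \in I\}$

As in Sections 3 and 4, let $\gamma = \beta/\sigma$, $Q = \hat{\tau}/\sigma$ and $D = m S^2/\sigma^2$. Also let
$\hat{\gamma} = \hat{\beta}/\sigma$. We obtain a simplified expression for the event $\{\theta \in I_\tau\}$ as
follows. It follows from $\hat{\beta}_\tau = G_\tau \hat{\beta}$ that $a^T \hat{\beta}_\tau/\sigma = a^T G_\tau \hat{\gamma}$. Since
$R(\hat{\beta}_\tau) = R(\hat{\beta}) + (\hat{\beta} - \hat{\beta}_\tau)^T X^T X (\hat{\beta} - \hat{\beta}_\tau) = m S^2 + \hat{\tau}^T V_{22}^{-1} \hat{\tau}$,

$$R(\hat{\beta}_\tau)/\sigma^2 = D + Q^T V_{22}^{-1} Q.$$

Thus

$$\begin{aligned}
\{\theta \in I_\tau\} &= \{\theta/\sigma \in I_\tau/\sigma\} \\
&= \left\{ a^T \beta/\sigma \in \left[ a^T \hat{\beta}_\tau/\sigma \pm t(m + k) \sqrt{\frac{R(\hat{\beta}_\tau)/\sigma^2}{m + k}} \, \sqrt{v^*} \right] \right\} \\
&= \left\{ a^T \gamma \in \left[ a^T G_\tau \hat{\gamma} \pm t(m + k) \sqrt{\frac{D + Q^T V_{22}^{-1} Q}{m + k}} \, \sqrt{v^*} \right] \right\}.
\end{aligned}$$

The following simplified expressions for the events $\{\theta \in I_\xi\}$ and $\{\theta \in I\}$ can be
obtained using a similar method. Let $U = [\, 1 \mid -I_{k-1} \,]$, where $1$ is the $(k - 1)$
vector of 1's and $I_{k-1}$ is the $(k - 1) \times (k - 1)$ identity matrix. Therefore
$\hat{\xi} = U \hat{\tau}$ and hence $\hat{\xi}/\sigma = U Q$.

$$\{\theta \in I_\xi\} = \left\{ a^T \gamma \in \left[ a^T G_\xi \hat{\gamma} \pm t(m + k - 1) \sqrt{\frac{D + Q^T U^T W_{22}^{-1} U Q}{m + k - 1}} \, \sqrt{w^*} \right] \right\} \qquad (4)$$

$$\{\theta \in I\} = \left\{ a^T \gamma \in \left[ a^T \hat{\gamma} \pm t(m) \sqrt{\frac{D}{m}} \, \sqrt{v_{11}} \right] \right\} \qquad (5)$$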
(5) 

Appendix C: Proof of Theorem 1

It follows from (2) and the simplified expressions for the events $\{\theta \in I_\tau\}$,
$\{\theta \in I_\xi\}$ and $\{\theta \in I\}$ given in Appendix B that the CP of the CI resulting from the
two-stage model selection procedure is a function of $\gamma$. In the present appendix, we
prove that $\Pr(\theta \in I, C)$ is a function of $(\gamma_{k+1}, \dots, \gamma_{2k})$. In the same manner, it can
be proved that $\Pr(\theta \in I_\tau, A)$ and $\Pr(\theta \in I_\xi, B)$ are also functions of
$(\gamma_{k+1}, \dots, \gamma_{2k})$. It follows from (2) that the CP is also a function of
$(\gamma_{k+1}, \dots, \gamma_{2k})$.

The occurrence or otherwise of the event $C = \{F_\tau > \ell_\tau, F_\xi > \ell_\xi\}$ is determined
by the statistics $F_\tau$ and $F_\xi$, defined in Section 3. Note that $F_\tau$ is a function of
$\hat{\tau}/\sigma$ and $m S^2/\sigma^2$, and $F_\xi$ is a function of $\hat{\xi}/\sigma$ and $m S^2/\sigma^2$. As in Appendix B,
$Q = \hat{\tau}/\sigma = (\hat{\gamma}_{k+1}, \dots, \hat{\gamma}_{2k})$,
$\hat{\xi}/\sigma = U Q = (\hat{\gamma}_{k+1} - \hat{\gamma}_{k+2}, \dots, \hat{\gamma}_{k+1} - \hat{\gamma}_{2k})$ and $D = m S^2/\sigma^2$.
Therefore, $F_\tau$ and $F_\xi$ are functions of $(Q, D)$. Thus, the occurrence or otherwise of
the event $C$ is determined by the random quantities

$$(\hat{\gamma}_{k+1}, \dots, \hat{\gamma}_{2k}) \text{ and } D.$$

In other words, the occurrence or otherwise of the event $C$ is determined by the
quantities

$$(\gamma_{k+1}, \dots, \gamma_{2k}), \quad \big((\hat{\gamma}_{k+1} - \gamma_{k+1}), \dots, (\hat{\gamma}_{2k} - \gamma_{2k})\big) \text{ and } D.$$

It follows from (5) that the occurrence or otherwise of the event $\{\theta \in I\}$ is
determined by the random quantities

$$\hat{\gamma} - \gamma \text{ and } D.$$

Therefore, the occurrence or otherwise of the event $\{\theta \in I\} \cap C$ is determined by
the quantities

$$(\gamma_{k+1}, \dots, \gamma_{2k}), \quad \hat{\gamma} - \gamma \text{ and } D.$$

Since $\hat{\gamma} - \gamma$ and $D$ are independent random vectors with

$$\hat{\gamma} - \gamma = (\hat{\beta} - \beta)/\sigma \sim N\big(0, (X^T X)^{-1}\big)$$

and $D$ has a $\chi^2_m$ distribution, $\Pr(\theta \in I, C)$ is (for given design matrix $X$) a function
of $(\gamma_{k+1}, \dots, \gamma_{2k})$.

Appendix D: Search for the minimum coverage probability

The CP of the CI described in Section 2 is a function of $(\gamma_4, \gamma_5, \gamma_6)$. It is very
difficult (if not impossible) to carry out a computational search for the minimum
CP of this CI over $(\gamma_4, \gamma_5, \gamma_6) \in \mathbb{R}^3$. To carry out this search, we need to restrict
the scope of this search. The two purposes of this appendix are to (a) describe the
method used to restrict this search and (b) report the result of carrying out this
restricted search for the minimum CP.

Our restriction of this search is based on the following simple result.

Lemma 1. For any events $S$ and $T$, $0 \le \Pr(S) - \Pr(S \cap T) \le \Pr(T^c)$. Consequently,
if $\Pr(T)$ is close to 1 then $\Pr(S \cap T)$ is close to $\Pr(S)$.
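
Lemma 1 is stated without proof. A short justification, offered here as a sketch using only the additivity and monotonicity of probability, is as follows. Since $S$ is the disjoint union of $S \cap T$ and $S \cap T^c$, and $S \cap T^c \subseteq T^c$,

```latex
\begin{align*}
\Pr(S) - \Pr(S \cap T) &= \Pr(S \cap T^{c}) \ge 0, \\
\Pr(S \cap T^{c}) &\le \Pr(T^{c}) = 1 - \Pr(T).
\end{align*}
```

The second statement of the lemma follows because the upper bound $1 - \Pr(T)$ is close to 0 when $\Pr(T)$ is close to 1.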

According to (2), the coverage probability $\Pr(\theta \in \mathrm{CI})$ is equal to

$$\Pr(\theta \in I_\tau, F_\tau \le \ell_\tau) + \Pr(\theta \in I_\xi, F_\tau > \ell_\tau, F_\xi \le \ell_\xi) + \Pr(\theta \in I, F_\tau > \ell_\tau, F_\xi > \ell_\xi). \qquad (6)$$

It can be shown that $\Pr(F_\tau > \ell_\tau) \approx 1$, for any value of $(\gamma_4, \gamma_5, \gamma_6)$ outside the
cube $[-0.25, 0.25]^3$. It follows from (6) and Lemma 1 that

$$\Pr(\theta \in \mathrm{CI}) \approx \Pr(\theta \in I_\xi, F_\xi \le \ell_\xi) + \Pr(\theta \in I, F_\xi > \ell_\xi), \qquad (7)$$

for any value of $(\gamma_4, \gamma_5, \gamma_6)$ outside this cube. In other words, for any value of
$(\gamma_4, \gamma_5, \gamma_6)$ outside this cube, the computation of the minimum CP can, to a very
good approximation, be based on the assumption that the model selection procedure
consists only of the second F test.

We now use the following result.

Lemma 2. If the model selection procedure consists only of the second F test then
the CP of the subsequently constructed CI for $\theta$ is a function of
$(\gamma_5 - \gamma_4, \gamma_6 - \gamma_4)$.

For the sake of brevity, we omit the proof of this result. It can be shown that
$\Pr(F_\xi > \ell_\xi) \approx 1$ for any value of $(\gamma_5 - \gamma_4, \gamma_6 - \gamma_4)$ outside the square
$[-0.2, 0.2]^2$. By Lemma 1, if the model selection procedure consists only of the second
F test then the CP of the subsequently constructed CI for $\theta$ is close to $1 - \alpha$, for
any value of the parameters $(\gamma_5 - \gamma_4, \gamma_6 - \gamma_4)$ outside this square.

Our conclusion is that we may search for the minimum CP of the CI described in
Section 2 as follows. Let $\min_1$ denote the estimate of the CP of this CI minimized
over $(\gamma_4, \gamma_5, \gamma_6) \in [-0.25, 0.25]^3$. Also, let $\min_2$ denote the estimate of the CP of
this CI, minimized over $(\gamma_5 - \gamma_4, \gamma_6 - \gamma_4) \in [-0.2, 0.2]^2$, for $\gamma_4 = 1000$ (so that
$\Pr(F_\tau > \ell_\tau) \approx 1$). Then, our estimate of the minimum CP of the CI described in
Section 2 is the smaller of $\min_1$ and $\min_2$. Using this procedure, with M = 10000
simulation runs for each parameter value, we found that $\min_1 = 0.4385$ and
$\min_2 = 0.5175$, so that the minimum CP of the CI described in Section 2 is estimated
to be 0.4385. The minimum CP is achieved for $(\gamma_4, \gamma_5, \gamma_6) \in [-0.25, 0.25]^3$. A
detailed description of this CP function for $(\gamma_4, \gamma_5, \gamma_6) \in [-0.25, 0.25]^3$ is
provided in Section 2.

Appendix E: Derivation of convenient expressions for the conditional probabilities described in Section 4

As in Appendices B and C, let $Q = \hat{\tau}/\sigma$ and $D = m S^2/\sigma^2$. Also, let $\Phi$ denote
the $N(0, 1)$ distribution function. The test statistics $F_\tau$ and $F_\xi$ are both functions
of $(Q, D)$. In this appendix, we make this explicit by writing $F_\tau = F_\tau(Q, D)$ and
$F_\xi = F_\xi(Q, D)$.

The following are convenient expressions for the conditional probabilities $p_\tau$,
$p_\xi$ and $p$ described in Section 4.

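• Let $e_\tau = t(m + k) \sqrt{(d + q^T V_{22}^{-1} q)/(m + k)} \, \sqrt{v^*}$. Note that

$$p_\tau(q, d) = \Phi\!\left( \frac{v_{21}^T V_{22}^{-1} (\tau/\sigma) + e_\tau}{\sqrt{v^*}} \right) - \Phi\!\left( \frac{v_{21}^T V_{22}^{-1} (\tau/\sigma) - e_\tau}{\sqrt{v^*}} \right)$$

if $F_\tau(q, d) \le \ell_\tau$; otherwise $p_\tau(q, d) = 0$.

• Let $e_\xi = t(m + k - 1) \sqrt{(d + q^T U^T W_{22}^{-1} U q)/(m + k - 1)} \, \sqrt{w^*}$ and
$s_{21} = v_{21} - C_\tau^T (X^T X)^{-1} C_\xi W_{22}^{-1} w_{21}$. Note that

$$p_\xi(q, d) = \Phi\!\left( \frac{w_{21}^T W_{22}^{-1} (\xi/\sigma) + s_{21}^T V_{22}^{-1} (\tau/\sigma - q) + e_\xi}{\sqrt{w^* - s_{21}^T V_{22}^{-1} s_{21}}} \right) - \Phi\!\left( \frac{w_{21}^T W_{22}^{-1} (\xi/\sigma) + s_{21}^T V_{22}^{-1} (\tau/\sigma - q) - e_\xi}{\sqrt{w^* - s_{21}^T V_{22}^{-1} s_{21}}} \right) \qquad (8)$$

if $F_\tau(q, d) > \ell_\tau$ and $F_\xi(q, d) \le \ell_\xi$; otherwise $p_\xi(q, d) = 0$.

• Finally, let $e = t(m) \sqrt{d/m} \, \sqrt{v_{11}}$ and note that

$$p(q, d) = \Phi\!\left( \frac{v_{21}^T V_{22}^{-1} (\tau/\sigma - q) + e}{\sqrt{v^*}} \right) - \Phi\!\left( \frac{v_{21}^T V_{22}^{-1} (\tau/\sigma - q) - e}{\sqrt{v^*}} \right)$$

if $F_\tau(q, d) > \ell_\tau$ and $F_\xi(q, d) > \ell_\xi$; otherwise $p(q, d) = 0$.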
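
All three conditional probabilities share one form: a difference of two values of $\Phi$ with a common shift, half-width and standard deviation, set to zero unless the corresponding model is selected. The Python sketch below captures only that common scalar form. The function names are hypothetical, and the caller is assumed to have already computed the scalar inputs (for example, for $p_\tau$: the shift $v_{21}^T V_{22}^{-1}(\tau/\sigma)$, the half-width $e_\tau$ and the standard deviation $\sqrt{v^*}$).

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # N(0, 1) distribution function


def interval_probability(shift, half_width, sd):
    """Phi((shift + half_width) / sd) - Phi((shift - half_width) / sd).

    Equals the probability that a N(mu, sd^2) variable lies within
    half_width of a target value t, where shift = t - mu.
    """
    return PHI((shift + half_width) / sd) - PHI((shift - half_width) / sd)


def conditional_coverage(shift, half_width, sd, selected):
    """Conditional probability term: zero unless this model was selected."""
    if not selected:
        return 0.0
    return interval_probability(shift, half_width, sd)


# Illustrative scalar inputs (hypothetical values).
p_centered = interval_probability(0.0, 1.959964, 1.0)
p_shifted = interval_probability(1.0, 1.959964, 1.0)
p_not_selected = conditional_coverage(0.0, 1.959964, 1.0, selected=False)
```

With zero shift and the familiar half-width 1.959964 the probability is 0.95 to four decimal places, and any nonzero shift lowers it, which is the mechanism by which a wrongly selected model reduces coverage.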

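We now present the proof of the formula for $p_\xi(q, d)$. The proofs of the formulas
for $p_\tau(q, d)$ and $p(q, d)$ are similar, but simpler. For the sake of brevity, we omit
these proofs. We use the notation

$$\mathcal{I}(\mathcal{A}) = \begin{cases} 1 & \text{if } \mathcal{A} \text{ is true} \\ 0 & \text{if } \mathcal{A} \text{ is false} \end{cases}$$

where $\mathcal{A}$ is an arbitrary statement. This is similar to the Iverson bracket notation
(Knuth, 1992). Observe that

$$\begin{aligned}
p_\xi(q, d) &= \Pr(\theta \in I_\xi, B \mid Q = q, D = d) \\
&= \Pr\big(\theta \in I_\xi, F_\tau(Q, D) > \ell_\tau, F_\xi(Q, D) \le \ell_\xi \mid Q = q, D = d\big) \\
&= E\big(\mathcal{I}(\theta \in I_\xi) \, \mathcal{I}(F_\tau(Q, D) > \ell_\tau, F_\xi(Q, D) \le \ell_\xi) \mid Q = q, D = d\big) \\
&= E\big(\mathcal{I}(\theta \in I_\xi) \, \mathcal{I}(F_\tau(q, d) > \ell_\tau, F_\xi(q, d) \le \ell_\xi) \mid Q = q, D = d\big)
\end{aligned}$$

by the substitution theorem for conditional expectations (see e.g. p. 9 of Bickel &
Doksum, 1977). Thus

$$p_\xi(q, d) = \begin{cases} \Pr(\theta \in I_\xi \mid Q = q, D = d) & \text{if } F_\tau(q, d) > \ell_\tau \text{ and } F_\xi(q, d) \le \ell_\xi \\ 0 & \text{otherwise.} \end{cases}$$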



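It follows from (4) that $\Pr(\theta \in I_\xi \mid Q = q, D = d)$ is equal to

$$\begin{aligned}
&\Pr\!\left( a^T \gamma \in \left[ a^T G_\xi \hat{\gamma} \pm t(m + k - 1) \sqrt{\frac{D + Q^T U^T W_{22}^{-1} U Q}{m + k - 1}} \, \sqrt{w^*} \right] \;\middle|\; Q = q, D = d \right) \\
&= \Pr\!\left( a^T \gamma \in \left[ a^T G_\xi \hat{\gamma} \pm t(m + k - 1) \sqrt{\frac{d + q^T U^T W_{22}^{-1} U q}{m + k - 1}} \, \sqrt{w^*} \right] \;\middle|\; Q = q, D = d \right) \qquad (9)
\end{aligned}$$

by the substitution theorem for conditional expectations. Since $\hat{\gamma}$ and $D$ are
independent random vectors, (9) is equal to

$$\begin{aligned}
&\Pr\!\left( a^T \gamma \in \left[ a^T G_\xi \hat{\gamma} \pm t(m + k - 1) \sqrt{\frac{d + q^T U^T W_{22}^{-1} U q}{m + k - 1}} \, \sqrt{w^*} \right] \;\middle|\; Q = q \right) \\
&= \Pr\big( a^T \gamma \in [\, a^T G_\xi \hat{\gamma} \pm e_\xi \,] \mid Q = q \big) \\
&= \Pr\big( a^T \gamma - e_\xi \le a^T G_\xi \hat{\gamma} \le a^T \gamma + e_\xi \mid Q = q \big). \qquad (10)
\end{aligned}$$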



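Note that the random vectors $a^T G_\xi \hat{\gamma}$ and $Q$ have the following multivariate normal
distribution.

$$\begin{pmatrix} a^T G_\xi \hat{\gamma} \\ Q \end{pmatrix} \sim N\!\left( \begin{pmatrix} a^T \gamma - w_{21}^T W_{22}^{-1} (\xi/\sigma) \\ \tau/\sigma \end{pmatrix}, \begin{pmatrix} w^* & s_{21}^T \\ s_{21} & V_{22} \end{pmatrix} \right)$$

Thus, the distribution of $a^T G_\xi \hat{\gamma}$ conditional on $Q = q$ is

$$N\!\left( a^T \gamma - w_{21}^T W_{22}^{-1} (\xi/\sigma) - s_{21}^T V_{22}^{-1} (\tau/\sigma - q), \; w^* - s_{21}^T V_{22}^{-1} s_{21} \right).$$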
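
The conditional distribution above is an instance of the standard result for partitioned multivariate normal vectors, restated here for convenience with generic symbols ($X$ scalar, $Q$ a vector) that are not part of the notation used elsewhere in this appendix:

```latex
\begin{pmatrix} X \\ Q \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} \mu_X \\ \mu_Q \end{pmatrix},
\begin{pmatrix} \sigma_{XX} & \Sigma_{XQ} \\ \Sigma_{QX} & \Sigma_{QQ} \end{pmatrix}
\right)
\;\Longrightarrow\;
X \mid Q = q \sim N\!\left(
\mu_X + \Sigma_{XQ} \Sigma_{QQ}^{-1} (q - \mu_Q),\;
\sigma_{XX} - \Sigma_{XQ} \Sigma_{QQ}^{-1} \Sigma_{QX}
\right)
```

Taking $X = a^T G_\xi \hat{\gamma}$, $\mu_X = a^T \gamma - w_{21}^T W_{22}^{-1} (\xi/\sigma)$, $\mu_Q = \tau/\sigma$, $\sigma_{XX} = w^*$, $\Sigma_{XQ} = s_{21}^T$ and $\Sigma_{QQ} = V_{22}$ gives the conditional mean $\mu_X + s_{21}^T V_{22}^{-1} (q - \tau/\sigma)$ and the conditional variance $w^* - s_{21}^T V_{22}^{-1} s_{21}$.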
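
Hence, (10) is equal to

$$\begin{aligned}
&\Pr\!\left( \frac{a^T \gamma - e_\xi - E(a^T G_\xi \hat{\gamma} \mid Q = q)}{\sqrt{\mathrm{Var}(a^T G_\xi \hat{\gamma} \mid Q = q)}} \le \frac{a^T G_\xi \hat{\gamma} - E(a^T G_\xi \hat{\gamma} \mid Q = q)}{\sqrt{\mathrm{Var}(a^T G_\xi \hat{\gamma} \mid Q = q)}} \le \frac{a^T \gamma + e_\xi - E(a^T G_\xi \hat{\gamma} \mid Q = q)}{\sqrt{\mathrm{Var}(a^T G_\xi \hat{\gamma} \mid Q = q)}} \;\middle|\; Q = q \right) \\
&= \Pr\!\left( \frac{w_{21}^T W_{22}^{-1} (\xi/\sigma) + s_{21}^T V_{22}^{-1} (\tau/\sigma - q) - e_\xi}{\sqrt{w^* - s_{21}^T V_{22}^{-1} s_{21}}} \le Z \le \frac{w_{21}^T W_{22}^{-1} (\xi/\sigma) + s_{21}^T V_{22}^{-1} (\tau/\sigma - q) + e_\xi}{\sqrt{w^* - s_{21}^T V_{22}^{-1} s_{21}}} \right),
\end{aligned}$$

where $Z \sim N(0, 1)$, which is equal to the expression (8).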